K-Medoids For K-Means Seeding
نویسندگان
چکیده
We run experiments showing that algorithm clarans (Ng et al., 2005) finds better Kmedoids solutions than the standard algorithm. This finding, along with the similarity between the standard K-medoids and K-means algorithms, suggests that clarans may be an effective K-means initializer. We show that this is the case, with clarans outperforming other popular seeding algorithms on 23/23 datasets with a mean decrease over k-means++ (Arthur and Vassilvitskii, 2007) of 30% for initialization mse and 3% for final mse. We describe how the complexity and runtime of clarans can be improved, making it a viable initialization scheme for large datasets.
منابع مشابه
A simple and fast algorithm for k medoids clustering pdf
This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting.This paper proposes a new algorithm for K-medoids clustering which runs like the. A new Kmedoids clustering method that should be fast and efficient.
متن کاملA K-means-like Algorithm for K-medoids Clustering
Clustering analysis is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting initial medoids. The proposed algorithm calculates the distance matrix once and uses it for finding new medoids at every i...
متن کاملAnalysis and Implementation of Modified K-medoids Algorithm to Increase Scalability and Efficiency for Large Dataset
Clustering plays a vital role in research area in the field of data mining. Clustering is a process of partitioning a set of data in a meaningful sub classes called clusters. It helps users to understand the natural grouping of cluster from the data set. It is unsupervised classification that means it has no predefined classes. Applications of cluster analysis are Economic Science, Document cla...
متن کاملA simple and fast algorithm for K-medoids clustering
This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting initial medoids. The proposed algorithm calculates the distance matrix once and uses it for finding new medoids at every iterative step. To evaluate the proposed algorithm, we use some real and artificial data sets and compare with the results of other algor...
متن کاملDAGger: Clustering Correlated Uncertain Data
◮ It can compute exact and approximate probabilities with error guarantees for the clustering output. State-of-the-art techniques (e.g. UK-means, UKmedoids, MMVar): ◮ do not support the possible worlds semantics, ◮ lack support for correlations and assume probabilistic independence, ◮ use deterministic cluster medoids or expected means, and ◮ can only compute clustering based on expected distan...
متن کامل